Diagnose log injection smoke test flakiness instead of masking it #11075

Draft
bm1549 wants to merge 4 commits into master from brian.marks/fix-log-injection-smoke-test-flake

Conversation

bm1549 (Contributor) commented Apr 9, 2026

What Does This Do

Adds diagnostic instrumentation to the check raw file injection smoke test so the next CI failure tells us the root cause instead of a bare "Condition not satisfied after 30s" with traceCount=0.

Changes to LogInjectionSmokeTest:

  1. waitForTraceCountAlive: checks process liveness on every poll iteration; if the process dies, it fails immediately with the exit code and the last 20 lines of process output
  2. Enriched timeout errors: on timeout, the error reports whether the process is alive, the traceCount, the number of RC polls received, and the last 30 lines of process output
  3. Reordered waits: waitForTraceCount(4) now runs before waitFor, so all traces are confirmed while the process is still alive, and the waitFor return value is asserted
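A minimal sketch of the liveness-aware wait described in item 1, written in Java rather than the test's Groovy. The class name `LivenessAwareWait` and the supplier parameters (standing in for the real process handle and trace counter) are assumptions for illustration, not the PR's code.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntSupplier;

/** Hypothetical stand-in for a liveness-aware wait; not the PR's Groovy code. */
final class LivenessAwareWait {

  /**
   * Polls until traceCount reaches expected, the process dies, or the timeout elapses.
   * Returns the observed count on success; throws IllegalStateException otherwise.
   */
  static int waitForTraceCountAlive(
      int expected,
      IntSupplier traceCount,
      BooleanSupplier processAlive,
      long timeoutMillis,
      long pollMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      int seen = traceCount.getAsInt();
      if (seen >= expected) {
        return seen; // enough traces arrived; a later normal exit is not an error
      }
      if (!processAlive.getAsBoolean()) {
        // Fail fast instead of waiting out the rest of the timeout.
        throw new IllegalStateException(
            "Process exited while waiting for " + expected + " traces (received " + seen + ")");
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException(
            "Timed out waiting for " + expected + " traces; traceCount=" + seen
                + ", process.alive=true");
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for traces", e);
      }
    }
  }

  public static void main(String[] args) {
    // A dead process with zero traces is reported on the first poll, not after 30 s.
    try {
      waitForTraceCountAlive(2, () -> 0, () -> false, 30_000, 200);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the sketch is the per-iteration liveness check: a crashed process surfaces as a distinct error on the next poll rather than as a generic timeout.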

Motivation

CI Visibility data for the last 30 days on master shows 10 failures of check raw file injection:

| Failure mode | Count | Line | Duration | Root cause |
|---|---|---|---|---|
| traceCount=0 at waitForTraceCount(2) | 9/10 | 368 | 30.3s | Unknown, no diagnostics |
| logLines.size()=3 at assertRawLogLinesWithInjection | 1/10 | 229 | 8.3s | Incomplete log file |

The duration distribution is bimodal: successful runs complete in 3.5-8.7s (80 data points, none above 9s), while the nine traceCount=0 failures all sit at 30.3s, the timeout. There is nothing in between. This suggests the process either works or is completely broken, so a timeout increase would only delay the same failure.

<9s:  ████████████████████████████████████████  80/80 passes
9-30s:                                           0 runs
30s:  █████████                                  9/10 failures (at timeout)

The current test is blind during the wait — it just polls traceCount in a loop. We don't know if the process crashed, hung during agent init, failed to connect to the test server, or something else entirely. This PR makes the next failure self-diagnosing.

Example output when process crashes:

Process exited with code 1 while waiting for 2 traces (received 0, RC polls: 3).
Last process output:
[dd.trace ...] ERROR ... NullPointerException during instrumentation
...

Example output on timeout (process alive but not sending traces):

Timed out waiting for 2 traces after 30s. traceCount=0, process.alive=true, RC polls received: 142.
Last process output:
[dd.trace ...] DEBUG ... Still loading instrumentations...
...
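How such a timeout message could be assembled, as a hedged Java sketch. The message format follows the example above; the class `TimeoutDiagnostics` and its method names are hypothetical, and the real test is Groovy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that formats the enriched timeout error. */
final class TimeoutDiagnostics {

  /** Returns a copy of the last n lines (or all lines if there are fewer than n). */
  static List<String> lastLines(List<String> lines, int n) {
    int from = Math.max(0, lines.size() - n);
    return new ArrayList<>(lines.subList(from, lines.size()));
  }

  /** Builds the timeout message with process state and the last 30 lines of output. */
  static String timeoutMessage(
      int expected,
      int timeoutSeconds,
      int traceCount,
      boolean alive,
      int rcPolls,
      List<String> processOutput) {
    return "Timed out waiting for " + expected + " traces after " + timeoutSeconds + "s."
        + " traceCount=" + traceCount
        + ", process.alive=" + alive
        + ", RC polls received: " + rcPolls + ".\n"
        + "Last process output:\n"
        + String.join("\n", lastLines(processOutput, 30));
  }

  public static void main(String[] args) {
    List<String> output = new ArrayList<>();
    for (int i = 1; i <= 40; i++) {
      output.add("line " + i);
    }
    // Prints the header followed by "line 11" through "line 40".
    System.out.println(timeoutMessage(2, 30, 0, true, 142, output));
  }
}
```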

Additional Notes

  • Only LogInjectionSmokeTest.groovy is changed
  • No timeout increase — the 30s defaultPoll is kept as-is
  • All 11 historically flaky backends pass locally
  • rcClientMessages.size() tells us whether the agent connected to the test server at all (RC polls hit /v0.7/config every 200ms)
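A small worked example of how the RC poll count can be read, assuming the 200ms poll interval stated above. The class `RcPollInterpretation` and its method names are hypothetical, and the "connected if at least one poll arrived" rule is an illustrative simplification.

```java
/** Hypothetical arithmetic helper for reading the RC poll count in a failure message. */
final class RcPollInterpretation {

  /** Poll interval quoted in the PR notes (RC polls hit /v0.7/config every 200ms). */
  static final long POLL_INTERVAL_MILLIS = 200;

  /** Number of polls a continuously connected agent would send in the given window. */
  static long expectedPolls(long windowMillis) {
    return windowMillis / POLL_INTERVAL_MILLIS;
  }

  /** Approximate time the agent spent polling, given the number of polls received. */
  static long pollingMillis(int pollsReceived) {
    return pollsReceived * POLL_INTERVAL_MILLIS;
  }

  /** Zero polls means the agent never reached the test server at all. */
  static boolean agentConnected(int pollsReceived) {
    return pollsReceived > 0;
  }

  public static void main(String[] args) {
    System.out.println(expectedPolls(30_000)); // 150
    System.out.println(pollingMillis(142));    // 28400
    System.out.println(agentConnected(0));     // false
  }
}
```

Read this way, the 142 polls in the timeout example correspond to roughly 28.4s of polling within the 30s window, which points at an agent that connected early and kept running without sending traces.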

Contributor Checklist

tag: no release notes
tag: ai generated

🤖 Generated with Claude Code

The `check raw file injection` test has been flaking across 11+ logging
backend variants for months. CI Visibility data shows 90% of failures are
`traceCount=0` at `waitForTraceCount(2)` after exactly 30s — the JVM +
agent bytecode instrumentation simply takes >30s on overloaded CI machines.

Changes:
- Add `startupPoll` with 120s timeout for the initial `waitForTraceCount(2)`
  that covers JVM startup + agent init, giving 4x headroom over the current
  30s `defaultPoll`
- Add `waitForTraceCountAlive` that checks process liveness on each poll
  iteration, turning silent 30-120s timeouts into instant, actionable errors
  when the process crashes
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all traces are
  delivered while the process is still alive
- Assert `waitFor` return value for a clear error if the process hangs

tag: no release note

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
bm1549 added the labels type: bug, comp: core, tag: no release notes, and tag: ai generated on Apr 9, 2026
pr-commenter bot commented Apr 9, 2026

Benchmarks

Startup

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/fix-log-injection-smoke-test-flake |
| git_commit_date | 1775744045 | 1775836958 |
| git_commit_sha | b266e2d | 9eb11aa |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~9eb11aac61 |

Matching parameters:

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775838770 | 1775838770 |
| ci_job_id | 1586148684 | 1586148684 |
| ci_pipeline_id | 107145233 | 107145233 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-xowwlhrv 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-xowwlhrv 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |
| module | Agent | Agent |
| parent | None | None |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 13 unstable metrics.

Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.063 s) : 0, 1063231
Total [baseline] (11.036 s) : 0, 11036293
Agent [candidate] (1.059 s) : 0, 1059136
Total [candidate] (11.063 s) : 0, 11063171
section appsec
Agent [baseline] (1.258 s) : 0, 1258131
Total [baseline] (11.191 s) : 0, 11191006
Agent [candidate] (1.248 s) : 0, 1247667
Total [candidate] (11.227 s) : 0, 11227484
section iast
Agent [baseline] (1.223 s) : 0, 1223111
Total [baseline] (11.405 s) : 0, 11404977
Agent [candidate] (1.232 s) : 0, 1232189
Total [candidate] (11.365 s) : 0, 11365187
section profiling
Agent [baseline] (1.186 s) : 0, 1185901
Total [baseline] (11.17 s) : 0, 11169917
Agent [candidate] (1.187 s) : 0, 1186759
Total [candidate] (11.109 s) : 0, 11109348
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.063 s | - |
| Agent | appsec | 1.258 s | 194.9 ms (18.3%) |
| Agent | iast | 1.223 s | 159.88 ms (15.0%) |
| Agent | profiling | 1.186 s | 122.67 ms (11.5%) |
| Total | tracing | 11.036 s | - |
| Total | appsec | 11.191 s | 154.713 ms (1.4%) |
| Total | iast | 11.405 s | 368.684 ms (3.3%) |
| Total | profiling | 11.17 s | 133.623 ms (1.2%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.059 s | - |
| Agent | appsec | 1.248 s | 188.531 ms (17.8%) |
| Agent | iast | 1.232 s | 173.053 ms (16.3%) |
| Agent | profiling | 1.187 s | 127.623 ms (12.0%) |
| Total | tracing | 11.063 s | - |
| Total | appsec | 11.227 s | 164.314 ms (1.5%) |
| Total | iast | 11.365 s | 302.017 ms (2.7%) |
| Total | profiling | 11.109 s | 46.178 ms (0.4%) |

gantt
    title petclinic - break down per module: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.246 ms) : 0, 1246
crashtracking [candidate] (1.222 ms) : 0, 1222
BytebuddyAgent [baseline] (634.887 ms) : 0, 634887
BytebuddyAgent [candidate] (634.371 ms) : 0, 634371
AgentMeter [baseline] (29.693 ms) : 0, 29693
AgentMeter [candidate] (29.438 ms) : 0, 29438
GlobalTracer [baseline] (250.611 ms) : 0, 250611
GlobalTracer [candidate] (248.879 ms) : 0, 248879
AppSec [baseline] (32.273 ms) : 0, 32273
AppSec [candidate] (32.071 ms) : 0, 32071
Debugger [baseline] (60.375 ms) : 0, 60375
Debugger [candidate] (60.007 ms) : 0, 60007
Remote Config [baseline] (618.742 µs) : 0, 619
Remote Config [candidate] (601.417 µs) : 0, 601
Telemetry [baseline] (8.238 ms) : 0, 8238
Telemetry [candidate] (8.048 ms) : 0, 8048
Flare Poller [baseline] (9.016 ms) : 0, 9016
Flare Poller [candidate] (8.291 ms) : 0, 8291
section appsec
crashtracking [baseline] (1.24 ms) : 0, 1240
crashtracking [candidate] (1.216 ms) : 0, 1216
BytebuddyAgent [baseline] (668.022 ms) : 0, 668022
BytebuddyAgent [candidate] (661.304 ms) : 0, 661304
AgentMeter [baseline] (12.143 ms) : 0, 12143
AgentMeter [candidate] (12.001 ms) : 0, 12001
GlobalTracer [baseline] (250.991 ms) : 0, 250991
GlobalTracer [candidate] (248.86 ms) : 0, 248860
IAST [baseline] (24.769 ms) : 0, 24769
IAST [candidate] (24.581 ms) : 0, 24581
AppSec [baseline] (185.223 ms) : 0, 185223
AppSec [candidate] (184.798 ms) : 0, 184798
Debugger [baseline] (66.316 ms) : 0, 66316
Debugger [candidate] (65.923 ms) : 0, 65923
Remote Config [baseline] (635.029 µs) : 0, 635
Remote Config [candidate] (602.889 µs) : 0, 603
Telemetry [baseline] (8.695 ms) : 0, 8695
Telemetry [candidate] (8.52 ms) : 0, 8520
Flare Poller [baseline] (3.575 ms) : 0, 3575
Flare Poller [candidate] (3.564 ms) : 0, 3564
section iast
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.227 ms) : 0, 1227
BytebuddyAgent [baseline] (800.829 ms) : 0, 800829
BytebuddyAgent [candidate] (807.248 ms) : 0, 807248
AgentMeter [baseline] (11.353 ms) : 0, 11353
AgentMeter [candidate] (11.527 ms) : 0, 11527
GlobalTracer [baseline] (238.852 ms) : 0, 238852
GlobalTracer [candidate] (240.599 ms) : 0, 240599
IAST [baseline] (25.717 ms) : 0, 25717
IAST [candidate] (25.92 ms) : 0, 25920
AppSec [baseline] (31.724 ms) : 0, 31724
AppSec [candidate] (33.66 ms) : 0, 33660
Debugger [baseline] (58.496 ms) : 0, 58496
Debugger [candidate] (58.713 ms) : 0, 58713
Remote Config [baseline] (1.125 ms) : 0, 1125
Remote Config [candidate] (528.187 µs) : 0, 528
Telemetry [baseline] (13.995 ms) : 0, 13995
Telemetry [candidate] (12.749 ms) : 0, 12749
Flare Poller [baseline] (3.446 ms) : 0, 3446
Flare Poller [candidate] (3.605 ms) : 0, 3605
section profiling
crashtracking [baseline] (1.188 ms) : 0, 1188
crashtracking [candidate] (1.174 ms) : 0, 1174
BytebuddyAgent [baseline] (692.66 ms) : 0, 692660
BytebuddyAgent [candidate] (691.408 ms) : 0, 691408
AgentMeter [baseline] (9.111 ms) : 0, 9111
AgentMeter [candidate] (9.156 ms) : 0, 9156
GlobalTracer [baseline] (206.735 ms) : 0, 206735
GlobalTracer [candidate] (208.067 ms) : 0, 208067
AppSec [baseline] (32.373 ms) : 0, 32373
AppSec [candidate] (32.877 ms) : 0, 32877
Debugger [baseline] (65.748 ms) : 0, 65748
Debugger [candidate] (66.024 ms) : 0, 66024
Remote Config [baseline] (569.316 µs) : 0, 569
Remote Config [candidate] (577.357 µs) : 0, 577
Telemetry [baseline] (7.916 ms) : 0, 7916
Telemetry [candidate] (7.924 ms) : 0, 7924
Flare Poller [baseline] (3.613 ms) : 0, 3613
Flare Poller [candidate] (3.562 ms) : 0, 3562
ProfilingAgent [baseline] (94.466 ms) : 0, 94466
ProfilingAgent [candidate] (94.681 ms) : 0, 94681
Profiling [baseline] (95.042 ms) : 0, 95042
Profiling [candidate] (95.248 ms) : 0, 95248
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.058 s) : 0, 1058032
Total [baseline] (8.851 s) : 0, 8851483
Agent [candidate] (1.073 s) : 0, 1072922
Total [candidate] (8.882 s) : 0, 8881549
section iast
Agent [baseline] (1.232 s) : 0, 1231661
Total [baseline] (9.575 s) : 0, 9574595
Agent [candidate] (1.223 s) : 0, 1222672
Total [candidate] (9.556 s) : 0, 9556242
  • baseline results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.058 s | - |
| Agent | iast | 1.232 s | 173.63 ms (16.4%) |
| Total | tracing | 8.851 s | - |
| Total | iast | 9.575 s | 723.112 ms (8.2%) |

  • candidate results

| Module | Variant | Duration | Δ tracing |
|---|---|---|---|
| Agent | tracing | 1.073 s | - |
| Agent | iast | 1.223 s | 149.75 ms (14.0%) |
| Total | tracing | 8.882 s | - |
| Total | iast | 9.556 s | 674.693 ms (7.6%) |

gantt
    title insecure-bank - break down per module: candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.236 ms) : 0, 1236
crashtracking [candidate] (1.251 ms) : 0, 1251
BytebuddyAgent [baseline] (632.931 ms) : 0, 632931
BytebuddyAgent [candidate] (643.785 ms) : 0, 643785
AgentMeter [baseline] (29.323 ms) : 0, 29323
AgentMeter [candidate] (29.811 ms) : 0, 29811
GlobalTracer [baseline] (248.94 ms) : 0, 248940
GlobalTracer [candidate] (252.393 ms) : 0, 252393
AppSec [baseline] (32.067 ms) : 0, 32067
AppSec [candidate] (32.692 ms) : 0, 32692
Debugger [baseline] (59.086 ms) : 0, 59086
Debugger [candidate] (60.144 ms) : 0, 60144
Remote Config [baseline] (599.948 µs) : 0, 600
Remote Config [candidate] (612.878 µs) : 0, 613
Telemetry [baseline] (8.057 ms) : 0, 8057
Telemetry [candidate] (8.309 ms) : 0, 8309
Flare Poller [baseline] (9.633 ms) : 0, 9633
Flare Poller [candidate] (7.433 ms) : 0, 7433
section iast
crashtracking [baseline] (1.239 ms) : 0, 1239
crashtracking [candidate] (1.221 ms) : 0, 1221
BytebuddyAgent [baseline] (809.019 ms) : 0, 809019
BytebuddyAgent [candidate] (801.161 ms) : 0, 801161
AgentMeter [baseline] (11.504 ms) : 0, 11504
AgentMeter [candidate] (11.372 ms) : 0, 11372
GlobalTracer [baseline] (239.112 ms) : 0, 239112
GlobalTracer [candidate] (239.026 ms) : 0, 239026
IAST [baseline] (25.766 ms) : 0, 25766
IAST [candidate] (25.794 ms) : 0, 25794
AppSec [baseline] (31.104 ms) : 0, 31104
AppSec [candidate] (30.055 ms) : 0, 30055
Debugger [baseline] (61.008 ms) : 0, 61008
Debugger [candidate] (61.004 ms) : 0, 61004
Remote Config [baseline] (1.13 ms) : 0, 1130
Remote Config [candidate] (532.636 µs) : 0, 533
Telemetry [baseline] (11.924 ms) : 0, 11924
Telemetry [candidate] (12.222 ms) : 0, 12222
Flare Poller [baseline] (3.431 ms) : 0, 3431
Flare Poller [candidate] (3.708 ms) : 0, 3708

Load

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/fix-log-injection-smoke-test-flake |
| git_commit_date | 1775744045 | 1775836958 |
| git_commit_sha | b266e2d | 9eb11aa |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~9eb11aac61 |

Matching parameters:

| Parameter | Baseline | Candidate |
|---|---|---|
| application | insecure-bank | insecure-bank |
| ci_job_date | 1775839240 | 1775839240 |
| ci_job_id | 1586148686 | 1586148686 |
| ci_pipeline_id | 107145233 | 107145233 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-unchtyed 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-unchtyed 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 2 performance improvements and 3 performance regressions! Performance is the same for 16 metrics, 15 unstable metrics.

| scenario | Δ mean agg_http_req_duration_p50 | Δ mean agg_http_req_duration_p95 | Δ mean throughput | candidate mean agg_http_req_duration_p50 | candidate mean agg_http_req_duration_p95 | candidate mean throughput | baseline mean agg_http_req_duration_p50 | baseline mean agg_http_req_duration_p95 | baseline mean throughput |
|---|---|---|---|---|---|---|---|---|---|
| scenario:load:insecure-bank:iast:high_load | worse [+55.894µs; +144.728µs] or [+2.199%; +5.694%] | unsure [+100.066µs; +571.357µs] or [+1.340%; +7.650%] | unstable [-214.047op/s; +91.422op/s] or [-15.284%; +6.528%] | 2.642ms | 7.805ms | 1339.125op/s | 2.542ms | 7.469ms | 1400.438op/s |
| scenario:load:petclinic:tracing:high_load | better [-1429.334µs; -448.428µs] or [-7.740%; -2.428%] | better [-2.240ms; -0.853ms] or [-7.478%; -2.848%] | unstable [-16.288op/s; +40.788op/s] or [-6.540%; +16.379%] | 17.528ms | 28.411ms | 261.281op/s | 18.466ms | 29.958ms | 249.031op/s |
| scenario:load:petclinic:profiling:high_load | worse [+1.664ms; +2.338ms] or [+9.096%; +12.780%] | worse [+1.340ms; +2.816ms] or [+4.498%; +9.455%] | unstable [-49.213op/s; +3.901op/s] or [-19.651%; +1.558%] | 20.294ms | 31.867ms | 227.781op/s | 18.294ms | 29.789ms | 250.438op/s |

Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (19.387 ms) : 19191, 19582
.   : milestone, 19387,
appsec (18.72 ms) : 18531, 18908
.   : milestone, 18720,
code_origins (17.711 ms) : 17535, 17888
.   : milestone, 17711,
iast (17.956 ms) : 17778, 18134
.   : milestone, 17956,
profiling (18.634 ms) : 18449, 18818
.   : milestone, 18634,
tracing (18.743 ms) : 18555, 18930
.   : milestone, 18743,
section candidate
no_agent (18.585 ms) : 18393, 18777
.   : milestone, 18585,
appsec (18.922 ms) : 18731, 19112
.   : milestone, 18922,
code_origins (18.155 ms) : 17972, 18338
.   : milestone, 18155,
iast (18.685 ms) : 18499, 18872
.   : milestone, 18685,
profiling (20.501 ms) : 20294, 20708
.   : milestone, 20501,
tracing (17.86 ms) : 17686, 18034
.   : milestone, 17860,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 19.387 ms [19.191 ms, 19.582 ms] | - |
| appsec | 18.72 ms [18.531 ms, 18.908 ms] | -666.786 µs (-3.4%) |
| code_origins | 17.711 ms [17.535 ms, 17.888 ms] | -1.675 ms (-8.6%) |
| iast | 17.956 ms [17.778 ms, 18.134 ms] | -1.43 ms (-7.4%) |
| profiling | 18.634 ms [18.449 ms, 18.818 ms] | -752.69 µs (-3.9%) |
| tracing | 18.743 ms [18.555 ms, 18.93 ms] | -643.806 µs (-3.3%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 18.585 ms [18.393 ms, 18.777 ms] | - |
| appsec | 18.922 ms [18.731 ms, 19.112 ms] | 336.737 µs (1.8%) |
| code_origins | 18.155 ms [17.972 ms, 18.338 ms] | -430.348 µs (-2.3%) |
| iast | 18.685 ms [18.499 ms, 18.872 ms] | 100.225 µs (0.5%) |
| profiling | 20.501 ms [20.294 ms, 20.708 ms] | 1.916 ms (10.3%) |
| tracing | 17.86 ms [17.686 ms, 18.034 ms] | -725.653 µs (-3.9%) |

Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.241 ms) : 1228, 1254
.   : milestone, 1241,
iast (3.268 ms) : 3221, 3315
.   : milestone, 3268,
iast_FULL (5.975 ms) : 5914, 6036
.   : milestone, 5975,
iast_GLOBAL (3.678 ms) : 3616, 3739
.   : milestone, 3678,
profiling (2.13 ms) : 2109, 2151
.   : milestone, 2130,
tracing (1.934 ms) : 1917, 1950
.   : milestone, 1934,
section candidate
no_agent (1.283 ms) : 1270, 1296
.   : milestone, 1283,
iast (3.42 ms) : 3368, 3472
.   : milestone, 3420,
iast_FULL (6.126 ms) : 6064, 6188
.   : milestone, 6126,
iast_GLOBAL (3.638 ms) : 3580, 3696
.   : milestone, 3638,
profiling (2.219 ms) : 2197, 2241
.   : milestone, 2219,
tracing (1.871 ms) : 1856, 1887
.   : milestone, 1871,
  • baseline results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.241 ms [1.228 ms, 1.254 ms] | - |
| iast | 3.268 ms [3.221 ms, 3.315 ms] | 2.027 ms (163.4%) |
| iast_FULL | 5.975 ms [5.914 ms, 6.036 ms] | 4.734 ms (381.4%) |
| iast_GLOBAL | 3.678 ms [3.616 ms, 3.739 ms] | 2.437 ms (196.3%) |
| profiling | 2.13 ms [2.109 ms, 2.151 ms] | 889.01 µs (71.6%) |
| tracing | 1.934 ms [1.917 ms, 1.95 ms] | 692.788 µs (55.8%) |

  • candidate results

| Variant | Request duration [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.283 ms [1.27 ms, 1.296 ms] | - |
| iast | 3.42 ms [3.368 ms, 3.472 ms] | 2.137 ms (166.5%) |
| iast_FULL | 6.126 ms [6.064 ms, 6.188 ms] | 4.843 ms (377.5%) |
| iast_GLOBAL | 3.638 ms [3.58 ms, 3.696 ms] | 2.355 ms (183.6%) |
| profiling | 2.219 ms [2.197 ms, 2.241 ms] | 935.903 µs (72.9%) |
| tracing | 1.871 ms [1.856 ms, 1.887 ms] | 588.454 µs (45.9%) |

Dacapo

Parameters

| Parameter | Baseline | Candidate |
|---|---|---|
| baseline_or_candidate | baseline | candidate |
| git_branch | master | brian.marks/fix-log-injection-smoke-test-flake |
| git_commit_date | 1775744045 | 1775836958 |
| git_commit_sha | b266e2d | 9eb11aa |
| release_version | 1.62.0-SNAPSHOT~b266e2d0c2 | 1.62.0-SNAPSHOT~9eb11aac61 |

Matching parameters:

| Parameter | Baseline | Candidate |
|---|---|---|
| application | biojava | biojava |
| ci_job_date | 1775838899 | 1775838899 |
| ci_job_id | 1586148689 | 1586148689 |
| ci_pipeline_id | 107145233 | 107145233 |
| cpu_model | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz | Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz |
| kernel_version | Linux runner-zfyrx7zua-project-304-concurrent-0-7b269uvy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux | Linux runner-zfyrx7zua-project-304-concurrent-0-7b269uvy 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux |

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.59 s) : 15590000, 15590000
.   : milestone, 15590000,
appsec (14.812 s) : 14812000, 14812000
.   : milestone, 14812000,
iast (17.946 s) : 17946000, 17946000
.   : milestone, 17946000,
iast_GLOBAL (18.062 s) : 18062000, 18062000
.   : milestone, 18062000,
profiling (15.24 s) : 15240000, 15240000
.   : milestone, 15240000,
tracing (14.828 s) : 14828000, 14828000
.   : milestone, 14828000,
section candidate
no_agent (15.138 s) : 15138000, 15138000
.   : milestone, 15138000,
appsec (15.346 s) : 15346000, 15346000
.   : milestone, 15346000,
iast (18.19 s) : 18190000, 18190000
.   : milestone, 18190000,
iast_GLOBAL (17.891 s) : 17891000, 17891000
.   : milestone, 17891000,
profiling (14.735 s) : 14735000, 14735000
.   : milestone, 14735000,
tracing (14.908 s) : 14908000, 14908000
.   : milestone, 14908000,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.59 s [15.59 s, 15.59 s] | - |
| appsec | 14.812 s [14.812 s, 14.812 s] | -778.0 ms (-5.0%) |
| iast | 17.946 s [17.946 s, 17.946 s] | 2.356 s (15.1%) |
| iast_GLOBAL | 18.062 s [18.062 s, 18.062 s] | 2.472 s (15.9%) |
| profiling | 15.24 s [15.24 s, 15.24 s] | -350.0 ms (-2.2%) |
| tracing | 14.828 s [14.828 s, 14.828 s] | -762.0 ms (-4.9%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 15.138 s [15.138 s, 15.138 s] | - |
| appsec | 15.346 s [15.346 s, 15.346 s] | 208.0 ms (1.4%) |
| iast | 18.19 s [18.19 s, 18.19 s] | 3.052 s (20.2%) |
| iast_GLOBAL | 17.891 s [17.891 s, 17.891 s] | 2.753 s (18.2%) |
| profiling | 14.735 s [14.735 s, 14.735 s] | -403.0 ms (-2.7%) |
| tracing | 14.908 s [14.908 s, 14.908 s] | -230.0 ms (-1.5%) |

Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.62.0-SNAPSHOT~9eb11aac61, baseline=1.62.0-SNAPSHOT~b266e2d0c2
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.486 ms) : 1474, 1498
.   : milestone, 1486,
appsec (3.838 ms) : 3615, 4062
.   : milestone, 3838,
iast (2.267 ms) : 2197, 2336
.   : milestone, 2267,
iast_GLOBAL (2.312 ms) : 2242, 2382
.   : milestone, 2312,
profiling (2.101 ms) : 2045, 2156
.   : milestone, 2101,
tracing (2.08 ms) : 2026, 2134
.   : milestone, 2080,
section candidate
no_agent (1.486 ms) : 1474, 1497
.   : milestone, 1486,
appsec (3.838 ms) : 3616, 4060
.   : milestone, 3838,
iast (2.278 ms) : 2208, 2348
.   : milestone, 2278,
iast_GLOBAL (2.321 ms) : 2250, 2392
.   : milestone, 2321,
profiling (2.095 ms) : 2040, 2150
.   : milestone, 2095,
tracing (2.082 ms) : 2028, 2136
.   : milestone, 2082,
  • baseline results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.486 ms [1.474 ms, 1.498 ms] | - |
| appsec | 3.838 ms [3.615 ms, 4.062 ms] | 2.352 ms (158.3%) |
| iast | 2.267 ms [2.197 ms, 2.336 ms] | 780.725 µs (52.5%) |
| iast_GLOBAL | 2.312 ms [2.242 ms, 2.382 ms] | 826.15 µs (55.6%) |
| profiling | 2.101 ms [2.045 ms, 2.156 ms] | 614.652 µs (41.4%) |
| tracing | 2.08 ms [2.026 ms, 2.134 ms] | 593.804 µs (40.0%) |

  • candidate results

| Variant | Execution Time [CI 0.99] | Δ no_agent |
|---|---|---|
| no_agent | 1.486 ms [1.474 ms, 1.497 ms] | - |
| appsec | 3.838 ms [3.616 ms, 4.06 ms] | 2.352 ms (158.3%) |
| iast | 2.278 ms [2.208 ms, 2.348 ms] | 792.12 µs (53.3%) |
| iast_GLOBAL | 2.321 ms [2.25 ms, 2.392 ms] | 835.051 µs (56.2%) |
| profiling | 2.095 ms [2.04 ms, 2.15 ms] | 609.511 µs (41.0%) |
| tracing | 2.082 ms [2.028 ms, 2.136 ms] | 595.779 µs (40.1%) |

The `check raw file injection` test flakes across 11+ logging backend
variants. CI Visibility data shows the failure is bimodal — successful
runs complete in 3-9s, but failures sit at exactly 30s (the
PollingConditions timeout) with traceCount=0. Nothing in between. This
means the process either works or is totally broken — no amount of
timeout increase will help.

The current test is blind during the 30s wait — it just polls
traceCount with no diagnostics when the process crashes or hangs.

Changes:
- Add `waitForTraceCountAlive` that checks process liveness on every
  poll iteration. If the process dies, it fails immediately with the
  exit code, RC poll count, and last 20 lines of process output.
- On timeout, enrich the error with diagnostic state (process alive?,
  traceCount, RC polls received, last 30 lines of output) so the next
  CI failure tells us whether it's a crash, a hang, or a connectivity
  issue.
- Reorder `waitForTraceCount(4)` before `waitFor` to confirm all
  traces are delivered while the process is still alive.
- Assert `waitFor` return value for a clear error if the process hangs.

tag: no release notes

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
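A hedged sketch of the "assert `waitFor` return value" change, assuming `waitFor` refers to `java.lang.Process.waitFor(long, TimeUnit)`, which returns `false` when the timeout elapses before the process exits. The class `ExitAssertion` and method `awaitExit` are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that turns an ignored waitFor result into a clear error. */
final class ExitAssertion {

  /** Waits for exit and returns the exit code; fails loudly if the process is still running. */
  static int awaitExit(Process process, long timeout, TimeUnit unit) {
    boolean exited;
    try {
      // Process.waitFor(long, TimeUnit) returns false if the timeout elapsed first.
      exited = process.waitFor(timeout, unit);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting for process exit", e);
    }
    if (!exited) {
      throw new IllegalStateException("Process still running after " + timeout + " " + unit);
    }
    return process.exitValue();
  }
}
```

Without the check, a hung process would let the test continue to later assertions and fail there with a less direct message.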
bm1549 changed the title from "Fix log injection smoke test flakiness from startup timeout" to "Diagnose log injection smoke test flakiness instead of masking it" on Apr 9, 2026
bm1549 and others added 2 commits April 10, 2026 11:58
The liveness check fired before the trace count check, so a normal
process exit after delivering all traces was treated as a failure.
Check traceCount >= count first and return early if satisfied.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
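The ordering problem in the commit above can be illustrated with two versions of a single poll-iteration check. This is a Java sketch with hypothetical names (`PollOrdering`, `Outcome`), not the test's Groovy code.

```java
/** Hypothetical illustration of check ordering inside one poll iteration. */
final class PollOrdering {

  enum Outcome { SATISFIED, PROCESS_DIED, KEEP_WAITING }

  /** Liveness first: a clean exit after all traces arrived is misreported as a failure. */
  static Outcome livenessFirst(int traceCount, int expected, boolean alive) {
    if (!alive) {
      return Outcome.PROCESS_DIED;
    }
    return traceCount >= expected ? Outcome.SATISFIED : Outcome.KEEP_WAITING;
  }

  /** Count first: the ordering the commit message describes. */
  static Outcome countFirst(int traceCount, int expected, boolean alive) {
    if (traceCount >= expected) {
      return Outcome.SATISFIED;
    }
    return alive ? Outcome.KEEP_WAITING : Outcome.PROCESS_DIED;
  }

  public static void main(String[] args) {
    // The process delivered all 4 traces and then exited normally.
    System.out.println(livenessFirst(4, 4, false)); // PROCESS_DIED
    System.out.println(countFirst(4, 4, false));    // SATISFIED
  }
}
```

Both orderings agree when the process is alive or when it died before delivering enough traces; they differ only in the "exited after delivering everything" case.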
PollingConditions.eventually only retries AssertionError. The liveness
check was throwing AssertionError, so a dead process still waited the
full 30s timeout. Switch to RuntimeException so it propagates
immediately. Also narrow the catch from Throwable to AssertionError.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
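To show why the exception type matters, here is a simplified Java stand-in for a polling helper that, as the commit message describes, retries only `AssertionError`. It is not Spock's implementation; `RetryOnAssertionOnly` and its `eventually` method are hypothetical.

```java
/** Hypothetical stand-in for a polling helper that retries only AssertionError. */
final class RetryOnAssertionOnly {

  /** Runs check until it passes or the timeout elapses; returns the number of attempts. */
  static int eventually(Runnable check, long timeoutMillis, long delayMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    int attempts = 0;
    AssertionError last = null;
    do {
      attempts++;
      try {
        check.run();
        return attempts;
      } catch (AssertionError e) {
        last = e; // remembered and retried until the deadline
      }
      // Any other exception type is not caught here and propagates immediately.
      try {
        Thread.sleep(delayMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while polling", e);
      }
    } while (System.nanoTime() - deadline < 0);
    throw new AssertionError("Condition not satisfied after " + attempts + " attempts", last);
  }

  public static void main(String[] args) {
    int[] calls = {0};
    try {
      eventually(() -> { calls[0]++; throw new AssertionError("traceCount=0"); }, 100, 20);
    } catch (AssertionError e) {
      System.out.println("AssertionError was retried, calls=" + calls[0]);
    }
    calls[0] = 0;
    try {
      eventually(() -> { calls[0]++; throw new IllegalStateException("process exited"); }, 100, 20);
    } catch (IllegalStateException e) {
      System.out.println("RuntimeException propagated, calls=" + calls[0]); // calls=1
    }
  }
}
```

Under this model, a liveness failure thrown as `AssertionError` is indistinguishable from "not ready yet" and is retried for the whole timeout, while a `RuntimeException` ends the wait on the first failing iteration.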